Carlos Rios, Universidad de Buenos Aires,
cm.rios.10@gmail.com PRIMARY
Ingrid Vargas, Universidad de Buenos Aires,
igrdvargas@gmail.com
Diego Edwards, Universidad de Buenos Aires,
edwards_diego@yahoo.com.ar
Student Team: YES
Tableau 7
Oracle 11g
SPSS Statistics 19
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe?
Process:
To analyze the network health, the following steps were carried out:
The variable TIPOBUSINESS (TIPOBU) was created to group the types of BUSINESSUNIT. TIPOBU has three types:
HEADQUARTER: groups all Headquarters.
LARGE REGION: groups Regions 1 to 10.
SMALL REGION: groups Regions 11 to 50.
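The grouping rule above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the analysis: the function name `tipo_bu` is hypothetical, and the label format of BUSINESSUNIT ("headquarters", "region-10") is an assumption.

```python
import re


def tipo_bu(business_unit: str) -> str:
    """Map a BUSINESSUNIT label to its TIPOBU group.

    Assumes labels look like "headquarters" or "region-<number>";
    the real label format may differ.
    """
    name = business_unit.strip().lower()
    if name.startswith("headquarter"):
        return "HEADQUARTER"
    match = re.fullmatch(r"region-(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized business unit: {business_unit!r}")
    number = int(match.group(1))
    if 1 <= number <= 10:
        return "LARGE REGION"
    if 11 <= number <= 50:
        return "SMALL REGION"
    raise ValueError(f"region number out of range: {number}")
```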
The following graphs were produced:
1. A heat map in which the size of each square reflects the number of machines with a given POLICYSTATUS and TIPOBU. Color represents the POLICYSTATUS value.
2. A bar chart showing the distribution of POLICYSTATUS by machine class and function. Color represents the POLICYSTATUS value.
3. A heat map showing the distribution of POLICYSTATUS in each region (BUSINESSUNIT). Color represents the POLICYSTATUS value.
Development
Figure 1
The heat map shows that a large percentage of machines are in a healthy state. In HEADQUARTER and SMALL REGION, more than 90% of machines have POLICYSTATUS=1. In LARGE REGION, a significant percentage of machines have POLICYSTATUS=2. For the other POLICYSTATUS values (3, 4, and 5), the heat map shows low percentages (below 1%).
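The percentages read off the heat map correspond to the share of machines per POLICYSTATUS within each TIPOBU group. A minimal sketch of that calculation follows, assuming one (TIPOBU, POLICYSTATUS) pair per machine; the function name `policy_share` and the input layout are hypothetical, and the analysis itself was done in the tools listed above rather than in Python.

```python
from collections import Counter


def policy_share(records):
    """Return {tipobu: {policystatus: percent}}.

    `records` is an iterable of (tipobu, policystatus) pairs,
    one pair per machine (an assumed input layout).
    """
    counts = Counter(records)

    # Total number of machines in each TIPOBU group.
    totals = Counter()
    for (group, _status), n in counts.items():
        totals[group] += n

    # Percentage of each POLICYSTATUS within its group.
    shares = {}
    for (group, status), n in counts.items():
        shares.setdefault(group, {})[status] = 100.0 * n / totals[group]
    return shares
```

Calling `policy_share` on the full machine list would give, for each group, the values that the square sizes in Figure 1 encode.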
Figure 2
The bar chart shows that, across machine classes and functions, a large number of machines are in a healthy state. POLICYSTATUS=2 accounts for a significant percentage of machines. The remaining states are hard to see because their percentages are so low.
Figure 3
The heat map shows the distribution of POLICYSTATUS in each region (1 to 50 and the HEADQUARTER). In regions 5 and 10, no machines are in a healthy state, and the squares for POLICYSTATUS=2 are large. These regions are therefore considered areas of concern and alert.
MC 1.2 Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?
Anomaly 1: High number of connections during non-working hours.
Location: Region 10
ClassMachine: Workstation
FunctionMachine: Teller
Process:
First of all, the VARIABLE HEALTHTIME_ZONA was
created to use the corresponding time zone according the information given by
the variable LONGITUDE. The HEALTHTIME_ZONE shows that the measurement period
corresponds to the 22:15 hours on February 1 until 05:00 o'clock on February
4.
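The exact conversion from LONGITUDE to a time zone is not spelled out here. One common approximation, shown below purely as an assumption, is that each 15 degrees of longitude corresponds to one hour of offset from the reference time. The function name `local_health_time` is hypothetical, and treating the reference time as sitting at longitude 0 is also an assumption.

```python
from datetime import datetime, timedelta


def local_health_time(health_time: datetime, longitude: float) -> datetime:
    """Shift a reference timestamp to an approximate local time.

    Assumption: one hour of offset per 15 degrees of longitude,
    rounded to the nearest whole hour (Python's round() resolves
    exact .5 ties to the nearest even integer). West longitudes
    are negative.
    """
    offset_hours = round(longitude / 15.0)
    return health_time + timedelta(hours=offset_hours)
```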
Next, the variable PERIODO was derived from HEALTHTIME_ZONA to represent the periods of the day:
DAWN: (00:00-06:59)
MORNING: (07:00-11:59)
AFTERNOON: (12:00-17:59)
NIGHT: (18:00-23:59)
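The PERIODO boundaries listed above translate directly into a small classification function. The sketch below is illustrative only; the function name `periodo` is hypothetical and the input is assumed to be the local time of day taken from HEALTHTIME_ZONA.

```python
from datetime import time


def periodo(local_time: time) -> str:
    """Classify a local time of day into a PERIODO label."""
    if local_time.hour < 7:
        return "DAWN"        # 00:00-06:59
    if local_time.hour < 12:
        return "MORNING"     # 07:00-11:59
    if local_time.hour < 18:
        return "AFTERNOON"   # 12:00-17:59
    return "NIGHT"           # 18:00-23:59
```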
The business rules were also taken into account: business hours are Monday-Friday, 7:00-18:00, in each time zone.
The number of connections in each region was analyzed by machine class and function during non-working hours (NIGHT and DAWN).
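The off-hours aggregation can be sketched as follows. This is a hedged illustration rather than the query actually run: the names `is_business_hours` and `off_hours_connections` and the row layout (region, local timestamp, connection count) are assumptions. The 18:00 boundary is treated as non-working, consistent with NIGHT starting at 18:00.

```python
from collections import defaultdict
from datetime import datetime


def is_business_hours(local_time: datetime) -> bool:
    """True for Monday-Friday, 07:00 up to (not including) 18:00."""
    # weekday(): Monday is 0, Sunday is 6.
    return local_time.weekday() < 5 and 7 <= local_time.hour < 18


def off_hours_connections(rows):
    """Sum connections per region outside business hours.

    `rows` is an iterable of (region, local_time, connections)
    tuples; this layout is assumed for the example.
    """
    totals = defaultdict(int)
    for region, local_time, connections in rows:
        if not is_business_hours(local_time):
            totals[region] += connections
    return dict(totals)
```

Plotting these totals per time step, one line per region, gives the kind of view in which an off-hours spike stands out.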
The anomaly was detected in a line graph in which each line represents a REGION, with time on the X axis and the number of connections on the Y axis.
Figure 4
On February 3, in the DAWN period, a high number of connections was observed for WORKSTATION-TELLER machines in REGION-10. A heat map was therefore produced to analyze that region in detail.
Figure 5
The size of the squares in the heat map shows that the WORKSTATION-TELLER machines belonging to the HEADQUARTERS and BRANCHES of REGION-10 had a large number of connections in the DAWN period of February 3, compared with the same period on February 2 and February 4.
Figure 6
A line graph was used to view the increase in the number of connections. The graph shows that the number of connections began to increase at about 1:00. This was evident because the headquarters, which had fewer than 2000 connections at 1:00, reached almost 16000 connections. Most of the branches reached fewer than 2000 connections. The anomaly ended at 5:00, when the number of connections began to decrease.
Anomaly 2: A large number of machines without activity during working hours.
Location: Region-25 (SMALL REGION).
Figure 7
Process:
The table shows the number of active machines on February 2 and 3 during working hours (07:00-18:00) and in the night period. On February 2, from 11:00 onward, the number of active machines in REGION-25 decreased significantly. By contrast, on February 3 the number of active machines is similar across all working hours.
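The day-to-day comparison described above can be sketched as a simple rule that flags hours in which one day's count of active machines falls well below the other day's. The function name `low_activity_hours` and the 50% threshold are hypothetical choices for illustration; the analysis itself relied on visual inspection of the table.

```python
def low_activity_hours(day_counts, baseline_counts, ratio=0.5):
    """Return the sorted hours where activity drops below the baseline.

    `day_counts` and `baseline_counts` map hour -> number of active
    machines. An hour is flagged when the day's count is below
    `ratio` times the baseline count for the same hour. The default
    ratio of 0.5 is an arbitrary illustrative threshold.
    """
    return sorted(
        hour
        for hour, count in day_counts.items()
        if hour in baseline_counts and count < ratio * baseline_counts[hour]
    )
```

With the hourly counts for February 2 as `day_counts` and those for February 3 as `baseline_counts`, the flagged hours would mark the window in which REGION-25 went quiet.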
To visualize this anomaly, a matrix was used in which the columns represent the days and hours (07:00-23:00) and the rows represent the branches. Each cell contains a pie chart showing the distribution of ACTIVITYFLAG at that moment.
All machines in a large number of REGION-25 branches were without activity. This began at 11:00 and lasted more than 9 hours.
ACTIVITYFLAG was checked for the value 2, which indicates that a machine is undergoing maintenance; however, that value did not appear. The behavior is therefore unusual, given that most of the branches showed a large percentage of ACTIVITYFLAG=1. In addition, on February 3 the machines were active during working hours.
Figure 8